Identifying and Assessing Interesting Subgroups in a Heterogeneous Population

Authors

  • Woojoo Lee
  • Andrey Alexeyenko
  • Maria Pernemalm
  • Justine Guegan
  • Philippe Dessen
  • Vladimir Lazar
  • Janne Lehtiö
  • Yudi Pawitan
Abstract

Biological heterogeneity is common in many diseases and is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability, which is the basis of cluster generation, is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems, we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm, and significance is assessed with a false discovery rate (FDR)-based measure. Under the heterogeneity condition, the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided.
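
As a rough illustration of what an FDR-based assessment involves, the sketch below computes Benjamini-Hochberg adjusted p-values for a set of tests, such as per-gene tests run within one candidate subgroup. This is the generic textbook procedure, not the improved FDR estimator the abstract refers to; the function name `bh_adjust` and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Generic textbook procedure (an assumption for illustration),
    not the improved FDR estimator described in the abstract.
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values from four tests within one subgroup.
example_q = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

Tests whose adjusted value falls below a chosen threshold (for example 0.05) would be reported as discoveries at that nominal FDR level. According to the abstract, this kind of standard estimate tends to overestimate the true FDR when the population is heterogeneous.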

Similar resources

Decision trees in epidemiological research

BACKGROUND In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. MAIN TEXT We review the literature on dec...

Identifying different subgroups of rabies virus in Iran

Abstract: Background: Rabies has been reported in all provinces and cities of Iran, although there has been no molecular study regarding different groups and subgroups of rabies virus by phosphoprotein gene. In this study, firstly, 48 and then 85 recent rabies isolates recovered from cases reported throughout Iran identified the evolutionary origins by molecular method of phosphoprotein gene r...

The effect of formative assessment methods on the academic achievement of fourth-grade elementary school students in experimental sciences

Formative assessment aims at assessing and evaluating the students' educational achievements, determining their learning weaknesses and strengths, identifying the flaws of teacher training methods and improving learning quality. Therefore, the present paper has studied the relationship between formative assessment and the fourth grade students' learning achievements in science, in Hamadan. Th...

Serious juvenile offenders: classification into subgroups based on static and dynamic characteristics

Background The population in juvenile justice institutions is heterogeneous, as juveniles display a large variety of individual, psychological and social problems. This variety of risk factors and personal characteristics complicates treatment planning. Insight into subgroups and specific profiles of problems in serious juvenile offenders is helpful in identifying important treatment indicators...

Identifying Tools and Methods For Risk Identification and Assessment in Construction Supply Chain

The construction project is a business full of risk in every process due to its complexity, changes, and involvement from various stakeholders. One of the critical risks in the construction project is in the supply chain. Identifying and assessing the risk with the right tools and methods in that area will inevitably affect the success of the project. Unfortunately, the research for the tools a...

Journal title:

Volume 2015, Issue

Pages -

Publication date: 2015